Intelligent model for the detection and classification of encrypted network traffic in cloud infrastructure

This article explores detecting and categorizing network traffic data using machine-learning (ML) methods, focusing specifically on the Domain Name System (DNS) protocol. DNS has long been susceptible to security flaws that have been exploited repeatedly, making DNS abuse a major concern in cybersecurity. Because attackers employ advanced tactics to steal data in real time, ensuring security and privacy for DNS queries and answers remains challenging. The evolving landscape of internet services has allowed attackers to launch cyber-attacks on computer networks. However, implementing Secure Socket Layer (SSL)-encrypted Hypertext Transfer Protocol (HTTP) transmission, known as HTTPS, has significantly reduced DNS-based assaults. To further enhance security and mitigate threats like man-in-the-middle attacks, the security community has developed DNS over HTTPS (DoH). DoH aims to combat the eavesdropping and tampering of DNS data during communication. This study employs an ML-based classification approach on a dataset for traffic analysis. The AdaBoost model effectively classified malicious and non-DoH traffic, with accuracies of 75% and 73% for the benign and DoH classes. The support vector classification model with a radial basis function kernel (SVC-RBF) achieved 76% accuracy in classifying malicious and non-DoH traffic. The quadratic discriminant analysis (QDA) model achieved 99% accuracy in classifying malicious traffic and 98% in classifying non-DoH traffic.

network effectively, it is critical to tell the difference between hazardous and benign data. It is crucial for private networks and the Internet to keep their DNS systems safe from intrusion by unwanted parties. Since hackers use advanced strategies to attack DNS requests and responses, a covert channel is used to encrypt DNS transfers and queries by establishing a connection with DNS over the HTTPS protocol. This method makes man-in-the-middle attacks harder to mount, since it improves privacy and addresses DNS weaknesses (Banadaki, 2020; Wazan & Cuppens, 2023).
An intrusion detection system (IDS) monitors the traffic of internet-connected devices and detects DoH traffic attacks in the network topology. Intrusion detection works by monitoring and analyzing events happening in a computer system or network (Larsen, Pahl & Coatrieux, 2023). The events of interest are attempts to compromise confidentiality, integrity, or availability, or to circumvent security safeguards. An IDS is among the best lines of protection against today's more sophisticated and widespread network attacks, and various IDSs can detect malicious traffic and distinguish it from legitimate communication. Algorithms such as naive Bayes, neural network regression, principal component analysis, random forest (RF), and support vector machines have been used to identify attacks (Jafar et al., 2021; Vries, 2021).
These methods can test and analyze DoH traffic in covert channels and tunnels. A systematic technique is presented here to evaluate the capabilities of various machine-learning algorithms. This study aims to identify and classify DoH traffic and to discriminate between benign and malicious DoH traffic using time-series classifiers in a two-layered ML technique (Raikar et al., 2020). The dataset, taken from the current version of the CIC dataset, implements the DoH protocol in an application employing four servers and five different browsers and software tools to record non-DoH, malicious-DoH and benign-DoH traffic (Banadaki, 2020). Layer one differentiates non-DoH from DoH traffic, while layer two differentiates malicious DoH from benign DoH traffic. Numerous ML methods are tested to classify non-DoH versus DoH traffic and, in the same way, malicious versus benign traffic (Khan, Raza & Hwang, 2022; Singh et al., 2022; Singh & Roy, 2020).
In the context of Software Defined Networks (SDN), various approaches exist for detecting DNS tunnels, such as statistical analysis of DNS packets and domain name analysis. These techniques often use statistical models to identify anomalous domain names. Indicators of DNS tunnels include DNS resolution frequency, subdomain length, and the presence of TXT records. Strategies like block listing domains, blocking IP addresses, and removing suspicious DNS packets can be employed to mitigate DNS tunneling. SDN, a concept revolutionizing network architecture, plays a crucial role in DNS operations (Jafarian et al., 2021). DNS serves as the backbone of the Internet, translating human-readable hostnames into computer-understandable IP addresses. The development of the DNS protocol followed a decentralized hierarchical approach. When a DNS client initiates a query for an IP address, the local DNS server first checks its cache. If the response is not found in the cache, the query is forwarded to a recursive DNS resolver. This resolver then iteratively queries the root name servers, the top-level domain (TLD) name servers, and finally the authoritative name servers until it obtains the authoritative response. DNS tunneling is a technique that leverages the DNS protocol to encapsulate data communication between a client and a server. In this method, data is encoded within the DNS records of a seemingly typical DNS request, and the server may or may not reply with encoded data. By integrating SDN principles into DNS operations, network administrators gain greater control and visibility over DNS traffic. SDN enables centralized management and programmability of network resources, facilitating the implementation of advanced security measures to detect and prevent DNS tunneling attacks.
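The tunnel indicators named above (resolution frequency, subdomain length, and TXT records) can be illustrated with a minimal sketch. The `DnsQuery` record, the function names, and all thresholds are hypothetical and chosen only for illustration; they are not taken from the study, and treating the last two labels as the registered domain is a simplification.

```python
from dataclasses import dataclass


@dataclass
class DnsQuery:
    client: str  # address of the querying host
    qname: str   # queried name, e.g. "a1b2c3.t.example.org"
    qtype: str   # record type, e.g. "A" or "TXT"


def subdomain_length(qname, registered_labels=2):
    """Characters in the labels left of the registered domain (dots excluded)."""
    labels = qname.rstrip(".").split(".")
    return sum(len(label) for label in labels[:-registered_labels])


def tunnel_indicators(queries, max_sub_len=40, max_queries=100, max_txt_ratio=0.5):
    """Return {client: [reasons]} for clients exceeding any illustrative threshold."""
    per_client = {}
    for q in queries:
        per_client.setdefault(q.client, []).append(q)

    flagged = {}
    for client, qs in per_client.items():
        reasons = []
        # Indicator 1: unusually long subdomains (encoded payload in the name).
        if max(subdomain_length(q.qname) for q in qs) > max_sub_len:
            reasons.append("long_subdomain")
        # Indicator 2: high resolution frequency within the observed window.
        if len(qs) > max_queries:
            reasons.append("high_frequency")
        # Indicator 3: a large share of TXT lookups.
        if sum(q.qtype == "TXT" for q in qs) / len(qs) > max_txt_ratio:
            reasons.append("many_txt")
        if reasons:
            flagged[client] = reasons
    return flagged


sample = [
    DnsQuery("10.0.0.5", "www.example.com", "A"),
    DnsQuery("10.0.0.9", "x" * 60 + ".t.example.org", "TXT"),
    DnsQuery("10.0.0.9", "y" * 55 + ".t.example.org", "TXT"),
]
flags = tunnel_indicators(sample)
```

Fixed thresholds of this kind are only a starting point; the statistical models mentioned above would learn such boundaries from labeled traffic.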
DoH and non-DoH traffic is captured using a two-layered technique. Browsers that support the DoH protocol and DNS tunneling tools are used to visit the top 10,000 Alexa websites and create HTTPS (non-DoH) traffic and both benign and malicious DoH traffic for the representative dataset. At the first layer, a statistical characteristics classifier divides the collected traffic into two categories: DoH and non-DoH. At the second layer, DoH traffic is classified as either benign or malicious using a time-series classifier. Accessing a website using the HTTPS protocol generates traffic designated as non-DoH. Many Alexa domain websites are visited to ensure the dataset is well balanced. 'Benign-DoH' is non-malicious DoH traffic created using the same method as for 'non-DoH' by utilizing the Mozilla Firefox and Google Chrome web browsers. Malicious DoH traffic is generated by DNS tunneling software such as dns2tcp, DNSCat2, and Iodine. These tools transmit TCP traffic as DNS queries; in other words, they build encrypted data tunnels. The DNS queries are forwarded to dedicated DoH servers through HTTPS requests, with the traffic encrypted using TLS (Belel, Dutta & Mukhopadhyay, 2023). Web browsers thus simulate normal online behavior, such as HTTPS and benign DoH, whereas malicious DoH is created using a combination of DoH tunnel-building tools (Khan et al., 2022). The resulting traffic is logged and used to train the classifiers, as shown in Fig. 1 (MontazeriShatoori et al., 2020).
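The control flow of the two-layered technique can be sketched as follows. The two stand-in classifiers, their feature names (`dst_port`, `mean_pkt_len`), and their thresholds are hypothetical placeholders for the trained layer-one and layer-two models; only the dispatch logic reflects the description above.

```python
def classify_two_layer(flow_stats, flow_series, is_doh, is_malicious):
    """Layer one separates DoH from non-DoH; layer two runs only on DoH flows."""
    if not is_doh(flow_stats):
        return "non-DoH"
    return "malicious-DoH" if is_malicious(flow_series) else "benign-DoH"


# Hypothetical stand-ins for the trained models (illustration only).
def layer1_is_doh(stats):
    # Statistical-feature rule standing in for the layer-one classifier.
    return stats["dst_port"] == 443 and stats["mean_pkt_len"] < 300


def layer2_is_malicious(series):
    # Time-series rule standing in for the layer-two classifier.
    return sum(series) / len(series) > 500


label_a = classify_two_layer({"dst_port": 443, "mean_pkt_len": 900}, [100, 120],
                             layer1_is_doh, layer2_is_malicious)
label_b = classify_two_layer({"dst_port": 443, "mean_pkt_len": 120}, [100, 120],
                             layer1_is_doh, layer2_is_malicious)
label_c = classify_two_layer({"dst_port": 443, "mean_pkt_len": 120}, [900, 800],
                             layer1_is_doh, layer2_is_malicious)
```

Passing the classifiers in as callables keeps the layers independent, so either model can be retrained or replaced without touching the other.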
The researchers explore the application of cloud-based semi-static secure accountable authority identity-based broadcast encryption featuring public traceability without random oracles, in the context of network traffic data detection and categorization using ML methods (Singh, Acharya & Dutta, 2023). The Domain Name System (DNS), one of the earliest and most vulnerable network protocols, has numerous security flaws that have been frequently exploited over time, creating significant concern in cybersecurity. Because cyber criminals use sophisticated attack strategies to steal data surreptitiously, ensuring the security and privacy of DNS queries and responses remains a complex task. The ever-evolving landscape of internet services has inadvertently provided a broad playing field for such cyber-attacks on computer networks. They focus on leveraging cloud support to enhance the effectiveness of ML-based classification in network traffic data detection and categorization (Mohamed et al., 2021). The intent is to further fortify the security of DNS communications and mitigate the risk of cyber-attacks, thereby improving the overall security architecture of computer networks.
This article makes three contributions. Firstly, we provide a unique two-layered technique: an ML model differentiates DoH traffic from non-DoH traffic at layer 1, and DoH traffic is characterized at layer 2. Secondly, a labeled dataset can be generated on the network premises by collecting non-DoH encrypted traffic, malicious-DoH and benign-DoH traffic. Thirdly, we introduce the notion of packet clumps and propose a new feature set based on a time-series representation of traffic flows, illustrating the efficiency of this feature set in encrypted traffic characterization (Srivastava et al., 2022).
This research makes several contributions to the detection and classification of DoH network traffic using ML techniques. Firstly, it proposes a novel two-layered classification approach for analyzing DoH communications in depth. At layer one, a statistical characteristic classifier is developed to differentiate DoH traffic from non-DoH traffic. Subsequently, layer two classifies the DoH traffic as either benign or malicious using time-series models. Secondly, to facilitate rigorous evaluation of various ML algorithms, an extensive labeled dataset is generated by collecting samples of benign-DoH, malicious-DoH and non-DoH traffic within a network testbed involving multiple browsers and servers. This provides a robust and representative dataset for comparative assessment. Thirdly, the study introduces the concept of packet clumps as a new feature engineering approach for encrypted traffic analysis. By extracting time-series representations based on packet clump characteristics, this feature set is shown to enhance the effectiveness of ML classifiers for the encrypted traffic detection task. Hence, this research advances the state of the art along several dimensions, ranging from a novel classification framework to the generation of a benchmark dataset and the proposal of improved learning features. The methodological approach and well-defined contributions allow meaningful evaluation and comparison of ML schemes for DoH network traffic identification and segmentation.
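The packet-clump idea can be illustrated with a short sketch. The text does not define a clump precisely, so the definition used here is an assumption: a clump is a maximal run of consecutive packets travelling in the same direction whose inter-arrival gap does not exceed a threshold. The `Clump` record, the `packet_clumps` function, and the `max_gap` value are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Clump:
    direction: int  # +1 outbound, -1 inbound
    start: float    # timestamp of the first packet in the clump
    packets: int    # number of packets merged into the clump
    size: int       # total bytes in the clump


def packet_clumps(packets, max_gap=0.1):
    """Group (timestamp, direction, size) tuples, sorted by time, into clumps.

    Assumed rule: a packet joins the current clump when it has the same
    direction and arrives within max_gap seconds of the previous packet.
    """
    clumps = []
    last_ts = None
    for ts, direction, size in packets:
        joins = (bool(clumps)
                 and clumps[-1].direction == direction
                 and ts - last_ts <= max_gap)
        if joins:
            clumps[-1].packets += 1
            clumps[-1].size += size
        else:
            clumps.append(Clump(direction, ts, 1, size))
        last_ts = ts
    return clumps


flow = [(0.00, 1, 100), (0.02, 1, 150), (0.03, -1, 1400),
        (0.04, -1, 1400), (0.50, -1, 200), (0.51, 1, 90)]
clumps = packet_clumps(flow)
summary = [(c.direction, c.packets, c.size) for c in clumps]
```

The resulting sequence of clump sizes, packet counts, and start times is one possible time-series representation of a flow that a layer-two classifier could consume.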

LITERATURE REVIEW
The LGBM and XGBoost algorithms surpass the competition in almost all classification measures, achieving classification accuracy of 100 percent in layers 1 and 2 (Banadaki, 2020). Out of 34 features taken from the CIRA-CIC-DoHBrw-2020 dataset, Source IP was the most important feature for differentiating non-DoH traffic from DoH traffic in layer one, followed by Destination IP.
LGBM and gradient boosting techniques use just Destination IP to distinguish benign and malicious data in layer 2 (Alarfaj et al., 2022; Banadaki, 2020). DNS is a critical component of the Internet's infrastructure. DNS's main job is to map domain names to IP addresses and direct users to the relevant computers, programs, and files (Niakanlahiji et al., 2023; Zang et al., 2023).
Because of DNS's security weaknesses, it is always a prime target for cybercriminals. Several machine-learning classifiers, including random forest (RF), K-nearest neighbor (KNN), and gradient boosting (GB), have been used to identify fraudulent DNS activity (Hadwan et al., 2022; Shiomoto, Otoshi & Murata, 2023; Singh & Roy, 2020; Ullah, Jabbar & Al-Turjman, 2020). DNS over HTTPS (DoH) improves internet security while enhancing user privacy. DoH, on the other hand, makes it more difficult for network managers to maintain the security of their systems. Because DoH traffic looks like normal HTTPS traffic, it is difficult to tell apart (Khodjaeva & Zincir-Heywood, 2021). DoH network traffic may be distinguished from non-DoH network traffic using many criteria that have been examined in depth (Vries, 2021). DNS is one of the most critical pieces of Internet infrastructure (Mitsuhashi et al., 2021). The proposed scheme's simulation results suggest that it can distinguish between malicious, benign, and non-DoH classes with 99 percent accuracy. Many academics have investigated various ML strategies to address this problem (Waqas et al., 2022). This research presents a systematic technique for recognizing malicious and encrypted DNS requests by monitoring network traffic and determining statistical features (AlQaralleh et al., 2022; Jafar et al., 2021).
The author then extends these features by estimating the flow's entropy in several ways. Using publicly accessible datasets, the author compares five ML classifiers: Decision Tree (DT), RF, Logistic Regression, Support Vector Machine, and Naive Bayes (Khodjaeva & Zincir-Heywood, 2021). Providing improved protection against attacks is becoming more important as the worldwide reach of Internet of Things (IoT) networks expands annually (Deebak et al., 2022; Tu et al., 2021b; Wang et al., 2021). Cyberattacks may be mitigated most effectively using an IDS (Lehniger, Saad & Langendörfer, 2022). A hybrid lightweight IDS based on data gathered from IoT networks has been proposed (Althobaiti et al., 2022; Sarkar et al., 2022; Ullah et al., 2021, 2020). When dealing with a vast dataset, XCNN and RCNN are 1,000 times quicker than KNN. XCNN took 86.18% less time to compute than KNN, but RCNN took 91.74% more. This benefit allows for more latitude in IDS site selection (Tu et al., 2021a). As a result of the IDS's minimal training requirements, response times to zero-day attacks are cut in half (Alassaf & Sikkandar, 2022; Liu et al., 2021).

Cloud-based apps security model
Secure instant messaging (IM) protocols should satisfy the broad security goals of confidentiality, reliability, authenticity, and integrity. A few even guarantee advanced security goals such as future secrecy (Lyu, Gharakheili & Sivaraman, 2022). Intuitively, a secure communication protocol should deliver a level of security equivalent to a private conversation in a closed room: everyone in the room hears the conversation, everyone knows who spoke and how often, no one outside the room can speak into it or listen to the conversation inside, and the door is opened only for invited people (Sun et al., 2022; Zhang, He & He, 2023).

Notations and assumptions
In reality, the IM protocols are centralized. All messages pass through a centralized server that receives them from the individual senders, stores them, and forwards them as soon as the receivers are online. The protocols therefore run in an asynchronous environment in which only the server is always online, as shown in Fig. 2. The algorithms first handle the message, and then the result is delivered to the end-user. The notations and terms used for the cloud-based security model are shown in Table 1.
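The store-and-forward behavior of the centralized server can be sketched as follows. The `RelayServer` class and its method names are hypothetical; the sketch models only queuing and delivery of opaque ciphertexts, not the encryption or signature schemes described in the following sections.

```python
from collections import defaultdict, deque


class RelayServer:
    """Centralized relay: stores ciphertexts and forwards them once the receiver is online."""

    def __init__(self):
        self.online = set()
        self.mailbox = defaultdict(deque)   # receiver -> queued (sender, ciphertext)
        self.delivered = defaultdict(list)  # receiver -> messages handed over so far

    def send(self, sender, receiver, ciphertext):
        # The server never needs the plaintext; it only queues the ciphertext.
        self.mailbox[receiver].append((sender, ciphertext))
        if receiver in self.online:
            self._flush(receiver)

    def connect(self, user):
        self.online.add(user)
        self._flush(user)

    def disconnect(self, user):
        self.online.discard(user)

    def _flush(self, user):
        # Deliver queued messages in arrival order and remove them from storage.
        while self.mailbox[user]:
            self.delivered[user].append(self.mailbox[user].popleft())


server = RelayServer()
server.send("alice", "bob", b"c1")        # bob is offline, so the message is stored
pending_before = list(server.delivered["bob"])
server.connect("bob")                     # bob comes online and receives the backlog
pending_after = list(server.delivered["bob"])
```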
Let us define a message as a tuple. Here is the finite set of user protocols. $ID_u$ is the set of user IDs, containing the username, the user's contact number, the last-seen time if it is visible to all, and the user bio if it is set to be public. If communication has already taken place, we can also have $U_{GiC}$; $U_{mld}$.
The user is uniquely referenced on a central server and contains and Misc. We denote encrypted communications as $C_1, C_2, \ldots, C_n \in l$. Every user on the communication network maintains long-term secrets for starting communication with other users, as well as session states. Messages delivered to an end-user are not saved in the session state. In distinguishing between the delivery and the receipt of messages, we highlight that the algorithms first handle the received message and then show the result to the end-user.

Asymmetric key encryption scheme
We propose an asymmetric-key encryption scheme. The scheme is used for encryption to preserve privacy, with the session key generated on Simple Matrix securing the message. The resulting representations are used for instant messaging (IM), and the notation of the asymmetric-key encryption scheme is shown in Table 2.
We have summarized the detailed information on the encryption scheme, which we have outlined in Fig. 2. The encryption process is straightforward. To encrypt a message $j$, the sender uses the public key $P$ to compute $u = P(j) \in F^m$ by polynomial evaluation. The security of the encryption scheme is based on the difficulty of solving quadratic equations, which an attacker would face when trying to decrypt a ciphertext $u$. We have summarized the encryption and decryption process in Fig. 3. We generate two linear maps, $G$ and $H$, in $F$, and the private key with matrices $Y$ and $Z$. The private keys are then used to calculate the public key in $F$. Decryption requires completing the following three steps. First, $d = G^{-1}(u)$ is computed as shown in Eq. (1).
$G'$ is a $q \times q$ matrix. Secondly, $c = (c_1, c_2, \ldots, c_n) = M^{-1}(d)$ must be computed. We suppose that the matrices are denoted in the following forms. We have to calculate the inverse of $E'_{q1}$, if it is invertible. We construct $i$ linear equations in the $i$ variables $c_1, c_2, \ldots, c_i$ based on $X'^{-1} \times E'_{q1} - Y = 0$ and solve them for these variables; $T'$ is an $i \times i$ matrix. If none of $E'_{q1}$, $E'_{q2}$ and $X'$ is invertible, the decryption process fails. Thirdly, we compute $j = H^{-1}(c)$ as shown in Eq. (2). Once the plaintext $j$ has been calculated, decryption is complete, and the asymmetric-key encryption scheme retains confidentiality.

Signature generation scheme for public key
Table 3 represents the public-key signature generation scheme with the given notations.
We have summarized the detailed information on the signature scheme. Three transformations $K_G$, $M$ and $K_H$ in $F$ are used to generate the private key. The private keys are used to calculate the public key, i.e., $P = K_G \circ M \circ K_H$ in $F$. The security of the signature scheme rests on quadratic equations in $F$. To sign an encrypted message $x = (x_0, x_1, \ldots)$, we first calculate the hash value of the message using a SHA-256-based hash function, as in Eq. (3).
Third, the central map transformation $M$ is calculated using Eq. (5).
Fourth, the affine transformation matrix $K_H$ is calculated based on the outcome of Eq. (6).
Finally, we generate the signature $z = (z_0, z_1, \ldots, z_{j-1}) \in F$. Signature verification is simple: given the signature $z$, we calculate Eq. (8).
This yields $x'' = (x''_0, x''_1, \ldots)$, which is compared with the hash value $x'$ of the original message. If $x'' = x'$, the signature is accepted; otherwise it is rejected (Liu et al., 2021). When it comes to the DoH protocol, malevolent actors may utilize it to build covert channels in several ways. We label this DoH tunneling network traffic "malicious," as shown in Fig. 5.
To facilitate reproducibility, we give additional specifics on the model evaluation protocol and implementation. Accuracy, AUC, confusion matrices and the other metrics presented here were selected following recommended practice for multi-class traffic classification tasks. While default parameters suffice as a baseline, tuning key hyperparameters (e.g., kernel type, regularization, ensembling parameters) can yield substantial improvements. The model training process therefore involves a systematic grid search over viable hyperparameter ranges for each algorithm. After sweeping through hundreds of combinations, the configurations yielding the highest cross-validation performance are retained. This iterative tuning finds an operating point that generalizes well. Stating the reasons behind the metric choices and the tuning heuristics adopted makes the research process more transparent and aids reproducibility.
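The grid search with cross-validation described above can be sketched in a few lines. This is a self-contained illustration and not the tuning code used in the study: the toy data, the nearest-neighbor model, and the grid values are assumptions, and in practice a library routine such as scikit-learn's `GridSearchCV` would typically take the place of these hand-written helpers.

```python
from collections import Counter
from itertools import product


def knn_predict(train, x, k):
    """Majority vote among the k training samples nearest to x (squared Euclidean)."""
    nearest = sorted(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_val_accuracy(data, k, folds=3):
    """Accuracy over interleaved folds: sample i belongs to fold i % folds."""
    correct = 0
    for f in range(folds):
        test = data[f::folds]
        train = [s for i, s in enumerate(data) if i % folds != f]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(data)


def grid_search(data, grid, folds=3):
    """Evaluate every combination in the grid; keep the first best-scoring one."""
    best = None
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = cross_val_accuracy(data, folds=folds, **params)
        if best is None or score > best[1]:
            best = (params, score)
    return best


# Toy, well-separated two-class data (illustration only).
data = ([((0.1 * i, 0.0), "benign") for i in range(6)]
        + [((5.0 + 0.1 * i, 5.0), "malicious") for i in range(6)])
best_params, best_score = grid_search(data, {"k": [1, 3, 5]})
```

Because ties keep the earlier combination, the order of the grid values determines which of several equally good configurations is reported.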
Dataset detail and pre-processing
DNS is among the earliest and most susceptible network protocols, and several of its security flaws have been exploited repeatedly over the years. In cybersecurity, DNS abuse has long been a major source of worry. Because attackers utilize advanced attack tactics to steal data on the fly, ensuring security and privacy for DNS queries and answers is still a difficult challenge (Gopi et al., 2022). The IETF established DNS over HTTPS (DoH) in RFC 8484 to address some DNS privacy and data manipulation issues. DoH encrypts DNS queries and sends them over an encrypted covert channel/tunnel, ensuring that data is not tampered with in transit. However, the lack of a representative dataset makes it difficult to evaluate methods for capturing DoH traffic in a network architecture. This study proposes a systematic way to study, test, and evaluate DoH traffic through covert channels and tunnels. In order to identify and analyze DoH traffic using a time-series classifier, this research implements DoH inside an application and captures both benign and malicious DoH traffic. Data were collected as previously described in Abid et al. (2023).
Using five different browsers and tools and four servers, the final dataset comprises a DoH protocol implementation in an application that captures benign-DoH, malicious-DoH, and non-DoH traffic. On the first tier of the two-layered technique described, traffic is classified as DoH or non-DoH; on the second tier, DoH communication is classified as either benign or malicious. Browsers such as Google Chrome and Mozilla Firefox and tunneling tools such as dns2tcp, DNSCat2, and Iodine generate the traffic, while DoH servers such as AdGuard, Cloudflare, and Google DNS reply to the DoH requests. Initially, the dataset is pre-processed by encoding the source and destination IP addresses and time stamp values using an ordinal encoder. The NA values are dropped, as shown in Algorithm 1.
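The two pre-processing steps, dropping rows with missing values and ordinal-encoding the IP address and time stamp columns, can be sketched as follows. This is not a transcription of Algorithm 1: the helper names are hypothetical, the column names `SourceIP`, `DestinationIP`, and `TimeStamp` are assumed, and rows are plain dictionaries with `None` marking a missing value.

```python
def ordinal_encode(values):
    """Map each distinct value to its integer rank in sorted order."""
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def preprocess(rows, encode_cols=("SourceIP", "DestinationIP", "TimeStamp")):
    # Step 1: drop every row that contains a missing (None) value.
    clean = [r for r in rows if all(v is not None for v in r.values())]
    # Step 2: replace the selected columns with ordinal codes (copies, not in place).
    out = [dict(r) for r in clean]
    for col in encode_cols:
        codes, _ = ordinal_encode([r[col] for r in clean])
        for row, code in zip(out, codes):
            row[col] = code
    return out


rows = [
    {"SourceIP": "192.168.1.2", "DestinationIP": "1.1.1.1",
     "TimeStamp": "2020-01-14 15:49:11", "Duration": 0.5},
    {"SourceIP": "192.168.1.1", "DestinationIP": "8.8.8.8",
     "TimeStamp": "2020-01-14 15:49:12", "Duration": None},
    {"SourceIP": "10.0.0.7", "DestinationIP": "1.1.1.1",
     "TimeStamp": "2020-01-14 15:49:10", "Duration": 1.2},
]
encoded = preprocess(rows)
```

Note that the codes follow string sort order, so they are arbitrary identifiers for the IP addresses rather than a meaningful numeric scale.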

RESULTS AND DISCUSSION
The K-nearest neighbors classifier is used with k = 4, and the total number of classes is four. The dataset is divided into two parts: the training set consists of 67% of the data, and the remaining 33% is held out as the test set. The overall accuracy of the model is 75%, which is modest but may improve in the full experiments. The results for the NonDoH class are very good, while those for the DoH, benign and malicious classes are promising. Precision is good for the benign class, while recall and F1-score are better for the DoH class, as shown in Table 4 and Fig. 6.
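The 67%/33% split and the per-class precision, recall, and F1-score reported above can be computed as in the following sketch. The helper names, the random seed, and the toy labels are assumptions for illustration; the values in Table 4 come from the study's own pipeline, not from this code.

```python
import random


def train_test_split(samples, test_fraction=0.33, seed=0):
    """Shuffle a copy of the samples and hold out test_fraction of them."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def per_class_report(y_true, y_pred):
    """Return {class: (precision, recall, f1)} from paired label lists."""
    report = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[cls] = (precision, recall, f1)
    return report


train, test = train_test_split(list(range(100)))
report = per_class_report(["DoH", "DoH", "NonDoH", "NonDoH"],
                          ["DoH", "NonDoH", "NonDoH", "NonDoH"])
```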
A comparative analysis between this article and some of the key references on malicious DNS traffic detection using ML techniques is summarized in Table 5. As shown in the table, while existing literature has explored related problems, this article makes several key contributions in terms of dataset diversity, proposed methodology, rigorous ML pipeline evaluation, and classification performance. The key differentiation of this two-layered classification approach leveraging time-series features is highlighted across various comparative aspects against prior art. Eight different ML models are trained on four classes. The predictions of some models are far from the true classes, while others produce false-positive and false-negative results. The overall results are shown as confusion matrices in Fig. 7. The AdaBoost and DT models classified the malicious and non-DoH classes without any confusion, and the other models also classified these classes accurately, except for SVC-RBF. For the remaining classes, the models, i.e., DT, naïve Bayes (NB), nearest neighbors, neural network, QDA, RF, and SVC-RBF, have some problems.
Summary statistics such as per-class accuracy, overall accuracy, macro average accuracy, and weighted accuracy obtained by the ML classifiers, i.e., AdaBoost, DT, NB, nearest neighbors, neural network, QDA, RF, and SVC-RBF, are shown in Fig. 8.
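How these summary measures relate to a confusion matrix can be shown with a short sketch. It assumes that per-class accuracy means the fraction of a class's samples that are predicted correctly (its recall); the text does not state the definition used, and the function name and counts below are hypothetical.

```python
def accuracy_summary(confusion):
    """confusion[true_class][predicted_class] -> count.

    Returns (per_class, macro, weighted, overall), taking per-class accuracy
    to be the share of each true class that is predicted correctly.
    """
    classes = list(confusion)
    support = {c: sum(confusion[c].values()) for c in classes}
    per_class = {c: confusion[c].get(c, 0) / support[c] for c in classes}
    total = sum(support.values())
    macro = sum(per_class.values()) / len(classes)            # unweighted mean
    weighted = sum(per_class[c] * support[c] for c in classes) / total
    overall = sum(confusion[c].get(c, 0) for c in classes) / total
    return per_class, macro, weighted, overall


confusion = {
    "malicious": {"malicious": 9, "benign": 1},
    "benign": {"malicious": 10, "benign": 10},
}
per_class, macro, weighted, overall = accuracy_summary(confusion)
```

Under this definition the support-weighted average coincides with the overall accuracy, whereas the macro average treats every class equally and is therefore more sensitive to small, poorly classified classes.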
The area under the curve obtained from the ML classifiers AdaBoost, QDA, and SVC-RBF is shown in Fig. 9. The SVC-RBF model achieved 76% accuracy on the malicious class and 76% on the non-DoH class, while the benign and DoH classes each reached 13%. Its macro average accuracy is 44%, and its micro average accuracy is 47%. The QDA model achieved 99% on the malicious class and 98% on the non-DoH class, with 78% on benign and 77% on DoH. Its macro average accuracy is 88%, and its micro average accuracy is 91%. The AdaBoost model correctly classified the malicious and non-DoH classes.
Although we explore an important problem regarding the classification of encrypted DNS traffic using ML, the specific research questions and knowledge gaps being addressed could be more clearly positioned. The authors should outline the precise real-world issues and limitations in existing methodologies that this work aims to tackle. For example, the introduction could highlight open questions around rigorously benchmarking complex ML algorithms for multi-class encrypted traffic analysis, and the lack of diversity in current DNS tunneling datasets. It can cite the dependency on standard corpora and single tunneling tools in prior approaches as an inherent limitation. Building on this problem framing, the novel contributions proposed (the two-layered methodology, the focus on time-series characterizations, and the data collection strategy spanning browsers and tunneling tools) can be presented as targeted efforts to fill these gaps. By first discussing the specific open research questions on applying ML to DNS security, assessing alternatives, and articulating their limitations, this work can concretely situate how its technical approach and results advance knowledge over the existing literature. The comparisons should emphasize dimensions such as model sophistication, dataset diversity, and classification granularity as differentiators to strengthen claims around the knowledge gaps addressed. Enhancing this contextual framing of research issues, current shortcomings, and targeted improvements will help accentuate the significance of the innovations introduced by the authors in the ML pipeline for encrypted DNS traffic analytics.

CONCLUSIONS
Computer networks have become easy targets for cyber-attacks in the ever-changing landscape of internet services. DNS assaults have been greatly reduced because of HTTPS. DoH helps protect against man-in-the-middle attacks by countering eavesdropping and DNS data tampering during DNS communication. Because attackers utilize advanced tactics to steal data, securing DNS queries and answers is still a difficult challenge. This study aimed to investigate the application of ML techniques for the detection and classification of DoH network traffic. Specifically, it sought to evaluate different models for identifying and distinguishing between benign, malicious and non-DoH communications within an encrypted traffic dataset.
The results demonstrate that the two-layered classification approach is highly effective at analyzing DoH traffic in depth.At layer one, the support vector classifier with RBF kernel achieved 76% accuracy in differentiating between malicious and non-DoH traffic.Meanwhile, at layer two, the QDA model attained classification rates of 99% and 98% for malicious traffic and non-DoH traffic respectively.The AdaBoost ensemble classifier also performed well, with accuracies of 75% and 73% for benign and DoH classes.
Notably, the time-series feature engineering based on packet clump representations enhanced encrypted traffic learnability.This validates the hypothesis that new learning representations tailored for HTTPS data payloads can improve detection quality.
In conclusion, the findings strongly support the research question by showing ML provides a viable solution for DoH network analysis.Classification performance often exceeded 90% for models trained on the custom dataset.This contributes significantly to knowledge by demonstrating ML is practical for encrypted DNS traffic understanding.Going forward, the two-layer framework and proposed feature set warrant further exploration on more extensive real-world DoH traffic corpora.With refinement, such techniques show promise for bolstering security and surveillance of encrypted network protocols.
As we look towards the future, it is clear that our work must continue to evolve alongside the complexities and variety of cyber threats that are also increasing.Despite the promising results that ML methods have demonstrated in the realm of network traffic data detection and categorization, the challenges posed by advanced attack tactics cannot be underestimated.Therefore, our next steps will involve several key areas of focus.We aim to improve multi-class classification by refining the SVC-RBF and QDA models that have shown good initial results.Our goal is to explore a wider range of ML and deep learning algorithms for this purpose, with the intent to achieve even higher accuracy levels in multiclass classification of network traffic data.In the context of binary classification, the superior training accuracy of the SVC-RBF, DT, and AdaBoost models in a 2-class scenario has pointed us towards a future validation of these models on different datasets.We will concentrate on enhancing the detection rate of malicious traffic while simultaneously minimizing both false positives and negatives.With the prevalence of advanced and dynamic attack tactics used by cyber criminals, it is paramount to develop ML models that are capable of learning and adapting to these tactics over time.Such an approach will help us maintain an edge over cyber threats, and ensure robust security for DNS queries.
Although DNS over HTTPS (DoH) has significantly reduced the frequency of DNS attacks, guaranteeing the security and privacy of DNS queries and responses remains a formidable challenge.Therefore, our future work will also focus on devising additional security measures to enhance the effectiveness of DoH.Recognizing the potential advantages of cloud technology, we plan to investigate cloud-based solutions for managing network traffic data.The scalability and distributed nature of the cloud could be harnessed to handle large-scale data more efficiently.

Figure 2 Summary of the syntax of IM protocols showing the cooperating user's interfaces and the boundaries of the application to the network. DOI: 10.7717/peerj-cs.2027/fig-2

Figure 4 A secure communication system between two users in the cloud environment. DOI: 10.7717/peerj-cs.2027/fig-4

Figure 5 Methodological framework for data capturing, analyzing and classifying (MontazeriShatoori et al., 2020). DOI: 10.7717/peerj-cs.2027/fig-5
Nearest neighbors: classify by majority vote of the K nearest samples in feature space, $\hat{y} = \operatorname{mode}\{y_i : i \in N_K(x)\}$.
Neural network: learn feature transformations $f(\cdot)$ and classification $g(\cdot)$ by optimizing a loss function over parameters $\theta$: $\min_{\theta} L(y, g(f(x; \theta)))$.
QDA: assume Gaussian distributions per class and find boundaries as shown in Eq. (

Table 1
Notation guide used in the cloud-based security model.

Table 3
Terms and notions used by signature generation scheme for public key.

Table 4
Statistical measures of the K-NN with K = 4.